Aligning WordNet Synsets and Wikipedia Articles
Authors
Abstract
This paper examines the problem of finding articles in Wikipedia to match noun synsets in WordNet. The motivation is that these articles enrich the synsets with much more information than is already present in WordNet. Two methods are used. The first is title matching, following redirects and disambiguation links. The second is information retrieval over the set of articles. The methods are evaluated over a random sample set of 200 noun synsets which were manually annotated. With 10 candidate articles retrieved for each noun synset, the methods achieve recall of 93%. The manually annotated data set and the automatically generated candidate article sets are available online for research purposes.
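The title-matching method and the recall figure can be illustrated with a minimal sketch. It is not the authors' implementation: the toy `REDIRECTS`, `DISAMBIGUATION`, and `ARTICLES` tables and the helper names `normalize`, `candidates_for_synset`, and `recall` are all hypothetical, and a real system would read these tables from a Wikipedia dump. The sketch assumes a synset is given as a list of lemmas and that recall counts a synset as a hit when its manually annotated article appears among the retrieved candidates.

```python
from typing import Dict, List, Set

# Hypothetical toy data standing in for a Wikipedia dump (not from the paper).
REDIRECTS: Dict[str, str] = {"Automobile": "Car"}
DISAMBIGUATION: Dict[str, List[str]] = {
    "Bank": ["Bank (finance)", "Bank (geography)"],
}
ARTICLES: Set[str] = {"Car", "Bank (finance)", "Bank (geography)"}


def normalize(lemma: str) -> str:
    """Turn a WordNet-style lemma such as 'motor_car' into a title-like string."""
    text = lemma.replace("_", " ").strip()
    return text[:1].upper() + text[1:]


def candidates_for_synset(lemmas: List[str], limit: int = 10) -> List[str]:
    """Collect candidate articles by title matching.

    Each lemma is matched against article titles, following one redirect
    and expanding disambiguation pages into the articles they list.
    """
    found: List[str] = []
    for lemma in lemmas:
        title = normalize(lemma)
        title = REDIRECTS.get(title, title)           # follow a redirect, if any
        targets = DISAMBIGUATION.get(title, [title])  # expand disambiguation links
        for target in targets:
            if target in ARTICLES and target not in found:
                found.append(target)
    return found[:limit]


def recall(gold: Dict[str, str], retrieved: Dict[str, List[str]]) -> float:
    """Fraction of annotated synsets whose gold article is among the candidates."""
    if not gold:
        return 0.0
    hits = sum(1 for synset, article in gold.items()
               if article in retrieved.get(synset, []))
    return hits / len(gold)
```

For example, `candidates_for_synset(["car", "automobile"])` resolves both lemmas to the single article `Car`, while `candidates_for_synset(["bank"])` expands the disambiguation page into its two listed articles. The `limit` parameter mirrors the cutoff of 10 candidates per synset mentioned in the abstract.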
Similar resources
Aligning Sense Inventories in Wikipedia and WordNet
In this paper, we study the alignment of Wikipedia articles and WordNet synsets. Therefore, we propose a method to convert Wikipedia to a sense inventory. We show that an aligned sense inventory of both resources has two major benefits: the coverage of senses can be increased and enhanced information about aligned senses can be obtained. Our study and conclusions are based on human annotations ...
Mapping WordNet synsets to Wikipedia articles
Lexical knowledge bases (LKBs), such as WordNet, have been shown to be useful for a range of language processing tasks. Extending these resources is an expensive and time-consuming process. This paper describes an approach to address this problem by automatically generating a mapping from WordNet synsets to Wikipedia articles. A sample of synsets has been manually annotated with article matches...
Constructing a class hierarchy with properties by refining and aligning Japanese wikipedia ontology and Japanese WordNet
Introduction We have proposed learning methods for building a large-scale and high accuracy general ontology called Japanese Wikipedia Ontology (JWO) by extracting the concepts and relationships between concepts from various semistructured resources in Japanese Wikipedia [3]. However, JWO has problems because it lacks upper classes and appropriate definitions of properties. Thus, the aim of our...
Learning the semantics of Wikipedia hyperlinks
I claim that hyperlinks in Wikipedia entries often correspond to semantic relationships between concepts, described by the entries. This bachelor’s thesis discusses supervised methods to automatically identify new links that correspond to a given relation (hyper-/or hyponymy). Training data is collected by mapping Wikipedia articles to WordNet synsets and then marking links where a relation bet...
The People's Web meets Linguistic Knowledge: Automatic Sense Alignment of Wikipedia and WordNet
We propose a method to automatically align WordNet synsets and Wikipedia articles to obtain a sense inventory of higher coverage and quality. For each WordNet synset, we first extract a set of Wikipedia articles as alignment candidates; in a second step, we determine which article (if any) is a valid alignment, i.e. is about the same sense or concept. In this paper, we go significantly beyond stat...